Spontaneous Speech Understanding for Robust Multi-Modal Human-Robot Communication
Authors
Abstract
This paper presents a speech understanding component for enabling robust situated human-robot communication. The aim is to obtain semantic interpretations of utterances that serve as a basis for multi-modal dialog management, even in cases where the recognized word stream is not grammatically correct. For the understanding process, we designed semantic processable units adapted to the domain of situated communication. Our framework supports the specific characteristics of spontaneous speech used in combination with gestures in a real-world scenario, and it also provides information about the dialog acts. Finally, we present a processing mechanism that uses these concept structures to generate the most likely semantic interpretation of an utterance and to evaluate that interpretation with respect to semantic coherence.
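As a rough illustration of how such concept structures and coherence-based selection of interpretations might be realized, the Python sketch below defines a minimal concept/interpretation data structure and ranks candidate interpretations by word coverage and known semantic relations. All names (Concept, Interpretation, coherence_score) and the scoring formula are illustrative assumptions, not the paper's implementation.

# Minimal sketch (not the authors' implementation): domain concept structures
# and a coherence-based ranking of candidate semantic interpretations.
from dataclasses import dataclass, field

@dataclass
class Concept:
    """A semantic processable unit with typed slots for domain-relevant roles."""
    name: str                                   # e.g. "GRASP_ACTION" or "OBJECT_REFERENCE"
    slots: dict = field(default_factory=dict)   # role -> filler, e.g. {"object": "cup"}

@dataclass
class Interpretation:
    """One candidate reading of a (possibly ungrammatical) recognized word stream."""
    concepts: list        # Concept instances covering parts of the utterance
    covered_words: int    # how many recognized words the concepts account for
    total_words: int

def coherence_score(interp: Interpretation, relations: set) -> float:
    """Score an interpretation by word coverage and by how many concept pairs
    stand in a known semantic relation (a crude stand-in for evaluating
    semantic coherence)."""
    coverage = interp.covered_words / max(interp.total_words, 1)
    pairs = [(a.name, b.name) for a in interp.concepts for b in interp.concepts if a is not b]
    related = sum(1 for p in pairs if p in relations)
    return coverage + 0.5 * related / max(len(pairs), 1)

def best_interpretation(candidates, relations):
    """Return the most likely interpretation under the coherence score."""
    return max(candidates, key=lambda i: coherence_score(i, relations))

# Usage: rank two candidate readings of "take the cup" accompanied by a pointing gesture.
relations = {("GRASP_ACTION", "OBJECT_REFERENCE"), ("OBJECT_REFERENCE", "GRASP_ACTION")}
candidates = [
    Interpretation([Concept("GRASP_ACTION"), Concept("OBJECT_REFERENCE", {"object": "cup"})], 3, 3),
    Interpretation([Concept("OBJECT_REFERENCE", {"object": "cup"})], 2, 3),
]
print(best_interpretation(candidates, relations).concepts)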
Similar Articles
Robust methods in automatic speech recognition and understanding
This paper overviews robust architecture and modeling techniques for automatic speech recognition and understanding. The topics include robust acoustic and language modeling for spontaneous speech recognition, unsupervised adaptation of acoustic and language models, robust architecture for spoken dialogue systems, multi-modal speech recognition, and speech understanding. This paper also discuss...
Iconic Gestures for Robot Avatars, Recognition and Integration with Speech
Co-verbal gestures are an important part of human communication, improving its efficiency and efficacy for information conveyance. One possible means by which such multi-modal communication might be realized remotely is through the use of a tele-operated humanoid robot avatar. Such avatars have been previously shown to enhance social presence and operator salience. We present a motion tracking ...
Learning issues in a multi-modal robot-instruction scenario
One of the challenges for the realization of future intelligent robots is to design architectures which make user instruction of work tasks by interactive demonstration effective and convenient. A key prerequisite for enhancement of robot learning beyond the level of low-level skill acquisition is situated multi-modal communication. Currently, most existing robot platforms still have to advance...
Situated robot learning for multi-modal instruction and imitation of grasping
A key prerequisite to make user instruction of work tasks by interactive demonstration effective and convenient is situated multi-modal interaction aiming at an enhancement of robot learning beyond simple low-level skill acquisition. We report the status of the Bielefeld GRAVIS-robot system that combines visual attention and gestural instruction with an intelligent interface for speech recognit...
Corpus-Based Training of Action-Specific Language Models
Especially in noisy environments like in human-robot interaction, visual information provides a strong cue facilitating a robust understanding of speech. In this paper, we consider the dynamic visual context of actions perceived by a camera. Based on an annotated multi-modal corpus of people who verbally explain tasks while they perform them, we present an automatic strategy for learning action...
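To make the idea of action-specific language models concrete, a minimal sketch is given below: it trains per-action unigram models from an annotated corpus and scores a recognized word stream with the model of the action currently observed by the camera. The corpus entries, function names, and smoothing floor are assumptions for illustration, not details taken from the paper.

# Illustrative sketch (assumed): per-action unigram language models trained
# from a small annotated corpus, used to rescore recognition hypotheses.
from collections import Counter

def train_action_lm(utterances):
    """Train a unigram language model (word -> probability) from utterances
    annotated with one action in a multi-modal corpus."""
    counts = Counter(w for u in utterances for w in u.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def score(sentence, lm, floor=1e-6):
    """Probability of a recognized word stream under an action-specific LM,
    with a small floor for unseen words."""
    p = 1.0
    for w in sentence.split():
        p *= lm.get(w, floor)
    return p

# Usage: given per-action corpora, pick the LM of the currently observed action
# and rescore recognition hypotheses with it.
corpora = {
    "pour": ["now I pour the water", "pour it into the cup"],
    "cut":  ["I cut the bread", "now cutting with the knife"],
}
lms = {action: train_action_lm(utts) for action, utts in corpora.items()}
print(score("pour the water", lms["pour"]))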